What modes of action define participation in Gravity Spy?
What modes of action distinguish promoted contributors from those who remain at the same level?
What is the relationship between activities and performance?
Do volunteers maintain certain routines?
Motifs were built in chunks of five activities so that overlapping activities would be captured within motifs.
To reduce the number of motifs, we combined the number of consecutive similar activities into a single motif element, such that five learning activities in a pattern of interactions would be represented as l5 and 12 learning interactions would be represented as l12.
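The construction described above can be sketched in Python as follows. This is only an illustrative sketch, not the pipeline used for the analysis: the activity codes (`l`, `c`, `break`), the function names `collapse_runs` and `chunk_motifs`, and the choice of a sliding window with `step=1` are all assumptions, since the text does not say whether chunks overlap or at which stage runs are collapsed.

```python
from itertools import groupby


def collapse_runs(activities):
    """Run-length encode consecutive identical activities, e.g. five 'l' -> 'l5'."""
    return [f"{code}{len(list(run))}" for code, run in groupby(activities)]


def chunk_motifs(tokens, size=5, step=1):
    """Return windows of `size` tokens; step=1 (an assumption) makes them overlap."""
    return [tuple(tokens[i:i + size]) for i in range(0, len(tokens) - size + 1, step)]


# Hypothetical activity stream (not project data):
# 'l' = learning, 'c' = classifying, 'break' = break.
activities = ["l"] * 5 + ["c", "c", "break"] + ["l"] * 12 + ["c", "break", "c"]

tokens = collapse_runs(activities)         # ['l5', 'c2', 'break1', 'l12', ...]
motifs = chunk_motifs(tokens)              # ordered chunks of five tokens
unordered = [frozenset(m) for m in motifs] # order discarded, duplicates merged
```

Converting each chunk to a `frozenset` drops both order and repeats, which is why a chunk of five tokens can end up as a set with fewer than five distinct elements.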
For motifs, we used descriptive statistics to understand common and uncommon modes of action (motifs), and whether a motif was rare as measured by term frequency-inverse document frequency (tf-idf).
Term frequency (tf) measures how frequently a word occurs in a document. Inverse document frequency (idf) decreases the weight for commonly used words and increases the weight for words that are rarely used in a collection of documents. Their product, tf-idf, indicates the frequency of a term adjusted for how rarely it is used.
To follow the analogy: a document is one of the grouping factors (e.g. level, user, promoted/not promoted) and a term is a motif. In this analysis, we are concerned only with which activities appear in each set of five, not the order of those activities.
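As a rough illustration of the analogy, the sketch below computes tf-idf with levels as documents and motifs as terms. It assumes the common definition idf = ln(number of documents / number of documents containing the term); the function name `tf_idf`, the string motif labels, and all counts are hypothetical and are not taken from the project data.

```python
import math
from collections import Counter


def tf_idf(counts_by_doc):
    """Map {document: Counter({term: count})} to {(document, term): tf-idf score}."""
    n_docs = len(counts_by_doc)

    # Number of documents in which each term appears at least once.
    doc_freq = Counter()
    for counts in counts_by_doc.values():
        doc_freq.update(counts.keys())

    scores = {}
    for doc, counts in counts_by_doc.items():
        total = sum(counts.values())
        for term, n in counts.items():
            tf = n / total                            # share of the document
            idf = math.log(n_docs / doc_freq[term])   # 0 if term is in every document
            scores[(doc, term)] = tf * idf
    return scores


# Hypothetical motif counts per level (illustrative only).
counts_by_level = {
    "level1": Counter({"l1|c-alow": 8, "c-bmed|break1": 2}),
    "level2": Counter({"l1|c-alow": 5, "c-bmed|break1": 5}),
    "level3": Counter({"c-bmed|break1": 9, "c-chigh|break1": 1}),
}

scores = tf_idf(counts_by_level)
```

In this toy input, `c-bmed|break1` appears in every level, so its score is zero everywhere regardless of how often it occurs, while `c-chigh|break1` appears in only one level and receives a positive score despite a low count.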
datatable(freq_by_rank)
INSIGHTS
- By total number of occasions, the most common sets of five interactions are found in Level 4: the combination {'c-chigh', 'break1', 'c-bmed'} occurs 4,463 times, while the next most frequent set, {'break1', 'c-bmed'}, occurs 2,197 times.
- The most common set of five actions at each level is:
  - Level 1: {'l1', 'c-alow', 'c-bmed'} N = 810
  - Level 2: {'l1', 'c-alow', 'c-bmed'} N = 465
  - Level 3: {'c-chigh', 'break1', 'c-bmed'} N = 1193
  - Level 4: {'c-chigh', 'break1', 'c-bmed'} N = 4463
- These motifs reveal several interesting patterns. Most notably, sessions of classifying followed by breaks appear to dominate later levels, while learning is a central component of activities in Levels 1 and 2. The motif (order preserved) for user 65 reveals a typical pattern of activities: l1,c-alow,l1,c-bmed,l1. We observe user 64 shifting between learning, classifying, learning, classifying, and learning again. This particular sequence was observed in Level 2 on 56 occasions, by 48 volunteers.
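The two kinds of counts reported above (occurrences of an unordered set per level, and occasions plus distinct volunteers for an order-preserved sequence) could be tallied along these lines. This is a sketch under assumptions: the record layout `(level, user_id, ordered motif)` and every value in `records` are hypothetical, chosen only to show the mechanics.

```python
from collections import Counter, defaultdict

# Hypothetical records: (level, user_id, ordered motif of five tokens).
records = [
    (2, "u1", ("l1", "c-alow", "l1", "c-bmed", "l1")),
    (2, "u1", ("l1", "c-alow", "l1", "c-bmed", "l1")),
    (2, "u2", ("l1", "c-alow", "l1", "c-bmed", "l1")),
    (2, "u2", ("c-bmed", "l1", "c-alow", "l1", "c-alow")),
    (4, "u3", ("c-chigh", "break1", "c-bmed", "break1", "c-chigh")),
    (4, "u3", ("break1", "c-bmed", "break1", "c-bmed", "break1")),
    (4, "u4", ("c-bmed", "c-chigh", "break1", "c-chigh", "break1")),
]

set_counts = defaultdict(Counter)  # level -> Counter of unordered sets
occasions = Counter()              # (level, ordered motif) -> times observed
volunteers = defaultdict(set)      # (level, ordered motif) -> distinct users

for level, user, motif in records:
    set_counts[level][frozenset(motif)] += 1  # order and repeats ignored
    occasions[(level, motif)] += 1            # order preserved
    volunteers[(level, motif)].add(user)

# Most common unordered set per level, with its count.
top_by_level = {lvl: c.most_common(1)[0] for lvl, c in set_counts.items()}
```

Keeping the unordered and ordered tallies separate makes it possible to report a set-level frequency alongside the number of occasions and volunteers for one specific sequence.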
datatable(project_tf_idf)
project_tf_idf.viz
INSIGHTS
- idf, and thus tf-idf, is zero for extremely common motifs, i.e. those that occur in every level. The inverse document frequency is higher for motifs that occur in fewer of the levels in the collection. The motifs reported in the previous section (popular modes of action) are therefore likely to have low idf and thus low tf-idf.
- The figure reports motifs that are uncommon across the project but have some importance within each level. It should be noted that many of the motifs in Levels 1 and 2 include a stop activity (meaning the user dropped out). We also notice that many of the motifs in Levels 1 and 2 do not include socializing activities.
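The zero scores noted above follow directly from the definition. As a brief worked form, assuming the natural-log version of idf (the symbols $n_{m,d}$, $N$, and $N_m$ are introduced here for illustration): let $n_{m,d}$ be the number of times motif $m$ occurs in level $d$, $N$ the number of levels, and $N_m$ the number of levels in which $m$ occurs at least once.

```latex
\mathrm{tf}(m, d) = \frac{n_{m,d}}{\sum_{m'} n_{m',d}}, \qquad
\mathrm{idf}(m) = \ln\frac{N}{N_m}, \qquad
\text{tf-idf}(m, d) = \mathrm{tf}(m, d)\cdot\mathrm{idf}(m)

% A motif present in every level has N_m = N, so
\mathrm{idf}(m) = \ln\frac{N}{N} = \ln 1 = 0
\quad\Longrightarrow\quad \text{tf-idf}(m, d) = 0 \ \text{for every } d .
```

A motif confined to a single level has the largest possible idf, $\ln N$, so even a modest term frequency yields a positive score.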
datatable(user_summary_sets)
user_summary_sets_viz